-
Autoregressive language models are the currently dominant paradigm for text generation, but they have fundamental limitations that cannot be remedied by scale, such as inherently sequential and unidirectional generation. While alternative classes of models have been explored, we have limited mathematical understanding of their fundamental power and limitations. In this paper we focus on Generative Masked Language Models (GMLMs), a non-autoregressive paradigm in which a model is trained via masking to fit conditional probabilities of the data distribution; these conditionals are then used as the transitions of a Markov chain that draws samples from the model. Empirically, these models strike a promising speed-quality tradeoff, since each step can typically be parallelized by decoding the entire sequence at once. We develop a mathematical framework for analyzing and improving such models which sheds light on questions of sample complexity and of inference speed and quality. Empirically, we adapt the T5 model for iteratively-refined parallel decoding, achieving a 2-3x speedup in machine translation with minimal loss in quality compared with autoregressive models. We run careful ablation experiments to give recommendations on key design choices, and make fine-grained observations on common error modes in connection with our theory. Our mathematical analyses and empirical observations characterize both the potential and the limitations of this approach, and can inform future work on improving the understanding and performance of GMLMs.
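To make the sampling procedure concrete, here is a minimal sketch of iteratively-refined parallel decoding in the mask-predict style, assuming a trained masked LM exposed through a hypothetical predict_tokens callable (per-position argmax tokens plus their conditional probabilities); the schedule and interface are illustrative, not the paper's exact algorithm.

    MASK = "<mask>"

    def parallel_decode(predict_tokens, seq_len, num_iters=10):
        # predict_tokens(tokens) -> (preds, confs): for every position, the
        # model's argmax token and its conditional probability under the
        # masked LM. This interface is a hypothetical stand-in.
        tokens = [MASK] * seq_len
        for t in range(num_iters):
            preds, confs = predict_tokens(tokens)
            tokens = list(preds)  # decode all positions in parallel
            if t == num_iters - 1:
                break
            # Re-mask the least-confident positions; the masked fraction
            # decays linearly across iterations (a mask-predict schedule).
            n_mask = seq_len * (num_iters - 1 - t) // num_iters
            for i in sorted(range(seq_len), key=lambda i: confs[i])[:n_mask]:
                tokens[i] = MASK
        return tokens

Each step of the chain costs one parallel forward pass, which is where the speedup over token-by-token autoregressive decoding comes from.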
-
Abstract. Global change research demands a convergence among academic disciplines to understand complex changes in Earth system function. Limitations related to data usability and computing infrastructure, however, present barriers to effective use of the research tools needed for this cross-disciplinary collaboration. To address these barriers, we created a computational platform that pairs meteorological data and site-level ecosystem characterizations from the National Ecological Observatory Network (NEON) with the Community Terrestrial Systems Model (CTSM), which is developed with university partners at the National Center for Atmospheric Research (NCAR). This NCAR–NEON system features a simplified user interface that facilitates access to and use of NEON observations and NCAR models. We present preliminary results that compare observed NEON fluxes with CTSM simulations, and describe how the collaboration between NCAR and NEON improves both the data and the model for the global change research community. Beyond datasets and computing, the NCAR–NEON system includes tutorials and visualization tools that facilitate interaction with observational and model datasets and open further opportunities for teaching and research. By expanding access to data, models, and computing, cyberinfrastructure tools like the NCAR–NEON system will accelerate integration across ecology and climate science disciplines to advance understanding in Earth system science and global change.
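As an illustration of the kind of model-data comparison the platform streamlines, here is a minimal sketch that scores CTSM-simulated fluxes against NEON tower observations; the file names and column names are hypothetical placeholders, not the system's actual interface.

    import numpy as np
    import pandas as pd

    # Hypothetical inputs: latent heat flux at one NEON site, observed
    # (tower) and simulated (CTSM), each with a "time" column.
    obs = pd.read_csv("neon_site_fluxes.csv", parse_dates=["time"]).set_index("time")
    sim = pd.read_csv("ctsm_site_fluxes.csv", parse_dates=["time"]).set_index("time")

    # Align on common timestamps, then compute simple skill metrics.
    both = obs.join(sim, lsuffix="_obs", rsuffix="_sim").dropna()
    err = both["latent_heat_sim"] - both["latent_heat_obs"]
    print(f"bias = {err.mean():.1f} W/m^2, rmse = {np.sqrt((err ** 2).mean()):.1f} W/m^2")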
-
Linear encoding of sparse vectors is widely used, but is commonly data-independent, missing any possible extra (but a priori unknown) structure beyond sparsity. In this paper we present a new method to learn linear encoders that adapt to data, while still performing well with the widely used l1 decoder. The convex l1 decoder prevents gradient propagation as needed in standard gradient-based training; our method is based on the insight that unrolling the convex decoder into T projected subgradient steps addresses this issue. Our method can be seen as a data-driven way to learn a compressed sensing measurement matrix. We compare the empirical performance of 10 algorithms over 6 sparse datasets (3 synthetic and 3 real). Our experiments show that there is indeed additional structure beyond sparsity in the real datasets; our method discovers and exploits it to create excellent reconstructions with fewer measurements (by a factor of 1.1-3x) than the previous state-of-the-art methods. We also illustrate an application of our method to learning label embeddings for extreme multi-label classification, and empirically show that it matches or outperforms the precision scores of SLEEC, one of the state-of-the-art embedding-based approaches.
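To sketch the key idea, below is a minimal differentiable decoder that unrolls T projected subgradient steps on min ||x||_1 subject to Ax = y, so that the measurement matrix A can be trained end to end; the step-size schedule, initialization, and training loop are illustrative choices, not the paper's exact configuration.

    import torch

    def unrolled_l1_decoder(A, y, T=20, step=0.1):
        # Decode y = A x with T projected subgradient steps on
        # min ||x||_1 s.t. A x = y; unrolling keeps every step
        # differentiable in A.
        pinv = A.T @ torch.linalg.inv(A @ A.T)  # A^+ for a fat, full-rank A
        x = y @ pinv.T                          # feasible start: least-norm solution
        for t in range(T):
            x = x - step / (t + 1) * torch.sign(x)  # subgradient of ||x||_1
            x = x - (x @ A.T - y) @ pinv.T          # project back onto {x : A x = y}
        return x

    # Illustrative training loop: learn A on batches of (up to) 5-sparse vectors.
    m, n = 20, 100
    A = torch.randn(m, n, requires_grad=True)
    opt = torch.optim.Adam([A], lr=1e-3)
    for _ in range(100):
        x_true = torch.zeros(32, n)
        x_true.scatter_(1, torch.randint(0, n, (32, 5)), torch.randn(32, 5))
        y = x_true @ A.T                        # encode with the current A
        loss = ((unrolled_l1_decoder(A, y) - x_true) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()

Because the subgradient steps and the affine projection are ordinary tensor operations, gradients flow through the decoder to A, which is exactly what the convex l1 decoder alone would block.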